A data-driven short video international communication model based on indicator system communication network and attention BiLSTM neural network

As one of the most popular ways to disseminate Internet content in the current era of media convergence, short videos have increasingly become part of people’s daily lives. This paper takes TikTok as the research object, focusing on the study of short video communication process. The research objective of this paper is to improve the international communication effect of short videos through data-driven analysis and strategy research on the law of international communication of short videos. Firstly, the key objective indicator system about international communication of short videos is constructed, the definition, selection and design logic of key objective indicators are introduced, the construction idea of the corresponding indicator system is proposed, and then the key objective indicator system about international communication of short videos and related data set are constructed. Secondly, an international communication network model for short videos is built. The construction of this network model is divided into three steps, the construction of the short video communication network KOIS-PN based on the key objective indicators system, the construction of AT-BiLSTM neural network model to adapt to the research scenario of short video international communication, the construction of an international communication network model of short videos combining KOIS-PN and AT-BiLSTM. The international short video communication network is constructed according to the key objective index system of international short video, and the key objective index system is used as nodes in the communication network, and the short video communication network is constructed by hierarchical relationship. Thirdly, the performance of the core theoretical model proposed in this paper is verified. The experimental results show that, the KOISPN-ATBiLSTM international communication network model of short videos has certain validity and advancement.


Related research
Among the communication-related studies of short videos, one major category is the short video communication studies based on humanities and social sciences, and another major category is the short video communication studies based on natural sciences 3 .Zhao 4 attributed TikTok's success to the platform's recommendation algorithm.The study noted that TikTok's recommendation algorithm is jointly determined by video content and user behavior 5 , and the algorithm then feeds back to the videos and users, tagging them, and there is a connection between video tags and user tags.At the same time, TikTok has become a mainstream video media platform, the recommendation algorithm has been polished by billions of data, and the various data of the video is able to reflect the user characteristics and propagation laws accurately, which can provide a data base for the study of international communication of short video propagation.
In the field related to short video communication research based on humanities and social sciences, in Zhou 6 based on the popular video content of Chinese culture theme released by folk subjects on TikTok platform, studied the current situation of international communication of Chinese culture in overseas social media platforms by the methods of case study and content analysis, made feature analysis of video content and analyzed the feedback effect of audience on video content in terms of awareness, emotion and behavior, and finally proposed targeted recommendations for the communication of Chinese culture on overseas social media platforms.In the same year, Zhang 7 studied the influence of short video information dissemination in the new media era, taking the official accounts of news media organizations as research cases, crawled the user information data of short video news corresponding to the numbers of likes, comments, plays, pop-ups, set weights from three aspects, such as the click index, interaction index and utilization index, and constructed the short video communication influence index, measured the information dissemination influence of short video news by integrating multiple dimensions such as video broadcast length, video quantity, video content and production form, and analyzed the influence factors of its communication effect.
Among the studies related to short video communication based on natural science, Zhong 8 modeled the popularity of online short videos on the TikTok platform, and the study pointed out that there are two typical communication patterns of single-gradient and multi-gradient for the cumulative plays of short videos, and constructed a like-communication dynamics model, and finally also analyzed the relationship between other factors and plays, which is a set of short video communication analysis methods based on communication dynamics.There are few algorithm studies directly for short video communication, but some scholars have made valuable studies in the field of communication research by applying the related algorithms involved in this study.Dang 9 conducted a study on the algorithm of the social network information dissemination based on neural network, reconstructed the social network information dissemination model, solved the problem about constructing the state transfer equation of the traditional prediction model, optimized the social network information dissemination prediction by using algorithms in the field of neural network, proposed the algorithm of the social network information dissemination prediction based on LSTM, and established the framework of the social network information dissemination model based on LSTM to predict information dissemination process.The SEIR model is a mathematical model used to describe the spread of infectious diseases.It classifies populations into four categories: susceptible, exposed, infectious, and recovered.The SEIR model uses a set of differential equations to describe the dynamic transitions between these different categories.Liu 10 applied deep learning methods to predict positive cases reported in Wuhan and four states in USA, and applied recurrent neural networks based on long-term and short-term memory and their bidirectional LSTM, stacked LSTM and traditional SEIR models to the Wuhan dataset to compare and select the best model for the positive case prediction task, and the results showed that the LSTM-based model performed significantly better than the traditional SEIR model.This paper argues that for short video platforms, objective indicators can more accurately reflect some of the subjective factors that are difficult to specify.The short video platform must design recommendation algorithms with maximizing benefits of the platform as the starting point, such an algorithm must accurately grasp characteristics of the user in order to expand audiences and improve conversion, and the platform's algorithms have been polished with billions of data, so these objective indicators are reliable enough to serve as nodes of the communication network and provide support for the research results.
At present, there are few related studies on the evaluation indicators of short videos at home and abroad.In the field of research on evaluation indicators of short videos, objective indicators and subjective indicators are usually fused to construct an indicator evaluation system.Yu 11 constructed a government short video dissemination effect evaluation indicator system to provide a theoretical basis for the development of government short videos.This study used the Delphi method to conduct questionnaire surveys, scored the importance and availability of indicators, used the analytic hierarchy process to weight the government short video dissemination effect evaluation indicator system, constructed a government short video dissemination effect evaluation indicator system, and selected research samples to verify the feasibility, stability and persistence of the indicator system, and finally formulated strategies to improve the dissemination effect of government short videos.Nazarov 4 conducted a study on TikTok network toolkits, determined the main factors affecting the content quality of personal accounts, established a fuzzy evaluation model to evaluate account content, and performed qualitative and quantitative analysis of the evaluation results.In the same year, Wang 12 analyzed the first-level and second-level indicators of the short video network public opinion early warning indicator system under information epidemic by combining literature research with field research.Through a combination of analytic hierarchy and expert survey methods, they calculated the weights of various levels of indicators, and finally verified the scientificity and applicability of the information epidemic under short video network public opinion early warning indicator system constructed through empirical research.
While in the research of short video dissemination in reality, subjective indicators are difficult to verify their certainty, or the research work on determining subjective indicators is often large, and in this era of rapid change and high-speed development, the certainty of these subjective indicators changes over time.Therefore, the evaluation results of studies containing subjective indicators are also unclear and unstable.This paper believes that for short video platforms, objective indicators can more accurately reflect some difficult-to-clarify subjective factors, and short video platforms must design recommendation algorithms with maximizing platform benefits as the starting point, such algorithms must accurately grasp user characteristics in order to expand the audience and improve conversion, and the algorithm of the platform has been polished by hundreds of millions of data, so these objective indicators have enough reliability to serve as nodes of the communication network to provide support for research results.At the same time, the problem to be solved in this paper is how to improve the international dissemination effect of short videos, under this big problem, this paper separates its objective part which is to improve the international dissemination width of short videos, and conducts research on this problem.
RNN is a special internal neural network with self-connection, and the fields in which RNN and its derived models are applied are very numerous.Any problems that need to consider the temporal order and have time characteristics can be solved by using RNN and its derived models.The main drawback of RNN models is the problem of forgetting and vanishing gradient 13 .The methods to improve the forgetting problem of RNN models are 14 : (1) Changing the model structure of RNN into a more advanced model, such as LSTM, GRU, Bi-LSTM and so on; (2) combining other modules to overcome the forgetting problem, such as introducing attention mechanism.LSTM is the most common variant of RNN, and the core idea is that the concept of "gate" is introduced 15 .The problem solved by Bi-LSTM is that the unidirectional LSTM can only remember information before the current moment, and lacks grasp of information after the current moment.With Bi-LSTM, the model has the ability to save information in both directions of history and future.At present, Bi-LSTM and its derived models have been widely used in time series prediction scenarios 16 .In dealing with time series problems, introducing Self-Attention module in RNN based encoder-decoder model can overcome the "forgetting problem" of original model, and how to introduce self-attention into LSTM is a research hotspot in recent years.In the international communication short video dissemination model designed in this paper, BiLSTM model combined with Attention is used as the basic algorithm framework, and the hierarchical structure between different features is realized by residual link method.

Construct a system of key and objective indicators for international communication of short videos
This section introduces and explains the methodology used for constructing the key objective indicator system for international communication of short videos.
Common communication studies are often based on information dissemination sequences or social networks only, which is difficult to model complex information dissemination effectively.In recent years, more and more scholars have combined communication models based on the social network with neural networks and achieved good results 13 .However, short video communication differs from traditional communication in that it lacks social attributes and is more determined by short video content, platform recommendation algorithms, and user behavior.Therefore, this paper will combine short video content, platform recommendation algorithm, and user behavior to select key objective indicators, build indicator system, and then conduct subsequent research.
The interpretation of key objective indicators (KOIS) is divided into two parts, key and objective.One meaning of key indicators is to play a key role in the communication of short videos, and another meaning is to facilitate scholars to study the indicators qualitatively or quantitatively in the actual research process; objective indicators refer to the indicators that are not influenced by human subjective factors, such as specific numbers, for example, the number of plays and likes, and specific descriptions, such as Chinese and English.Under the above definition, combined with the actual operation, the following key objective indicators that need to be used in this paper are summarized, as shown in Table 1.
After completing the selection of indicators, this paper continues to elaborate the hierarchical relationship of indicators.According to the dynamic and static properties of key objective indicators, the key objective indicators are horizontally divided into two categories of first-level indicators.

• Static indicator
The key objective indicators that cannot be changed after the release of short videos, including content type, production mode, presentation language and originality.

• Dynamic indicator
The key objective indicators that will change over time after the release of short videos, including likes, comments, downloads, shares, play growth rate, play growth acceleration and play volume.
According to the cause and effect relationship of dynamic indicators, dynamic indicators are horizontally divided into two types of secondary indicators.• Status indicator A dynamic indicator that does not evaluate the communication power of short videos directly.
• Outcome indicator A dynamic indicator that directly serve as the basis for evaluating the communication power of short videos, the play volume.
The reasons for using the play volume as an indicator to evaluate the communication power of short videos are as follows: A term similar to the concept of play volume is flow, which refers to the amount of exposure distributed by the servers to video.Play volume is the number of views.The behavior of the current audience on the short video platform is divided into two types, one is leisure and entertainment, and the other is shopping.
Video play volume = Exposure * Exposure click rate; Likes = Exposure * Exposure click rate * Video like rate; Live turnover = Exposure * Exposure click rate * Viewing-commodity click rate * commodity click-order rate * customer unit price; In addition, in the communication of short videos, almost all quantifiable indicators such as comments, downloads, shares and so on can be broken down into similar formulas, and all of them contain exposure.Ideally, exposure should be used as the ultimate indicator for evaluating the spread of short videos, but as mentioned in the previous words, this paper only studies the front end data which are easy to collect.Play volume is the closest one to exposure, so it will be used as the basis for evaluating the spread of short videos, i.e. the Outcome Indicator.
According to the elaboration of the above hierarchical relationship, a set of key objective indicators system with hierarchy for short video communication is constructed, as shown in Fig. 1.
Combining the index system and the temporal characteristics of short video communication, this paper constructs a short video communication network based on the key and objective indicators system of short video communication (KOIS-PN).The network structure is shown in Fig. 2.
The network in the figure can be understood that the status indicators and the static indicators jointly determine the short video communication results.Considering the temporal characteristics of short video   www.nature.com/scientificreports/communication, the short video communication results in the current moment are fed back to the status indicators in the future.It is easy to see that the structure of the graph is very similar to the basic framework of recurrent neural network, therefore, this paper adopts the approach based on Attention and Bi-LSTM neural network for the international communication research of international communication short videos.

AT-BiLSTM neural network model based on Attention and LSTM
The Attention-based Bi-LSTM model is actually the introduction of Attention into the Bi-LSTM framework.The structural framework designed in this paper is shown in Fig. 3. Following Fig. 3, x = (x 1 , x 2 , . . .x n ) is the input of the model.Bottom-up H from Fig. 3 is the Bi-LSTM neural network module.Solid lines from Fig. 3 denote forward LSTM layers and dashed lines from Fig. 3 denote reverse LSTM layers; in the output part of the Bi-LSTM neural network module, − → h n from Fig. 3 denotes the final state value of the forward LSTM in H, and a from Fig. 3 is the Attention probability of the information about all time positions for the final status of the forward LSTM, and y from Fig. 3 is the final output.Since the experiments conducted in this paper are playback predictions, we only focus on that final status of the forward LSTM in the two-layer Bi-LSTM, i.e.
− → h n .Attention(Q, K, v n ) which contains the − → h n association information with the history moment information and is the history moment information at any position, which alleviates the long-distance forgetting problem of Bi-LSTM.The specific calculation process is shown in Eqs. ( 1)-( 4): H denotes the input sequence x = (x 1 , x 2 , . . .x n ) after Embedding by Bi-LSTM module.W Q , W K , W v are the weight matrices.a contains h 1 , h 2 , . . .h n , the information on the degree of similarity between different location.v n contains the final status information of the forward LSTM in the Bi-LSTM.a with v n , the weight sum of the two is obtained as − → h n and h 1 , h 2 , . . .h n .And y is the final output.

Short video communication model combining KOIS-PN and AT-BiLSTM
This section introduces and explains the methodology used for developing the international communication network model.Among the key objective metrics of short videos, the model very easily ignores the static data due to the presence of static metrics, i.e., data that do not have temporal characteristics and are simply fed together into the time series model.Therefore, this paper will adopt a combination of the international communication short video communication network based on the short video communication key objective indicator system and the Attention-based Bi-LSTM model to study the international communication of short videos accordingly.The indicators in the system of Key and objective indicators constructed are specifically summarized into four categories: (1) Expanding the basic framework of the model yields the specific model structure shown as Fig. 5.The index of category I is denoted as F , and at the moment t , the input of the AT-BiLSTM module is denoted as x t .x t which contains all the representations except the predictors.The third category of indicators at the moment n is denoted as s n .h i is the output in the layer of the Bi-LSTM at the moment i .
− → h t is the final status in the layer of the forward LSTM at the moment t .y is the prediction result.The complete forward algorithm is as follows: (i) Calculation H t , G t  The output of the forward hidden layer of the Bi-LSTM is denoted as − → h t , and the output of the inverse hid- den layer is denoted as ← − h t .The hidden status of the network in this layer is denoted as H t , which is calculated as shown in Eqs. ( 5)- (12): , tanh is the hyperbolic tangent function, i t is the input gate, f t is the forgetting gate, o t is the output gate, c t is the cell status, W and b denote the weight matrix and bias vector of the network, ⊙ denotes the corresponding element multiplication.
(ii) Calculation Attention(Q, K, v n ) H denotes the input sequence x = (x 1 , x 2 , . . .x n ) after two layers of Bi-LSTM module Embedding.W Q , W K , W v are the weight matrices.a contains the similarity information between different locations of h 1 , h 2 , . . .h n .v n contains the final status of the forward LSTM in the Bi-LSTM.The weight sum of a and v n is obtained as the connection between − → h n and h 1 , h 2 , . . .h n , Attention(Q, K, v n ).

(iii) Calculation output y
This model has several features relative to the usual time series forecasting models: One is that the model framework is constructed based on the communication network.Previous algorithms in the middle of the neural network were interpreted so poorly that a deeper network had to be designed to solve the problem.The architecture involved in this paper is highly explanatory, and there are few models that combine metric networks with neural networks in the research field of short videos.Besides, based on this architecture, the model addresses the problem of forgetting non-time series in the time series model and the problem of smoother information flow by means of residual connection.
The other is that this paper combines the specific situation of short video communication and designs a network that incorporates the Attention mechanism into Bi-LSTM, so that after the input is Encode Embedding, the algorithm can learn the association between the final status and the arbitrary previous location, which overcomes the long-distance forgetting problem of Bi-LSTM.For example, if a short video with a very good trend suddenly stops playing, it is more likely that the short video has entered the background audit to suspend the distribution of traffic.In this case, the model should refer more to the information from more distant locations, in which case the module actually serves to filter out the anomalous information and enhance the robustness of the model.

Experiment
This section introduces and explains the specific techniques, algorithms, and data sources employed.
The coefficient of determination 17 , also known as the goodness of fit, is expressed as R 2 .The coefficient of determination reflects what percentage of the fluctuations in the dependent variable y can be described by the fluctuations in x.That is what percentage of the representation variation in the dependent variable y can be explained by the independent variable x.
The definition is shown in Eqs. ( 20)-( 23): In the equation, the total error is SS tot , the random error is SS res , y i is the predicted value, y i is the mean predicted value, f i is the true value, n is the number of samples.From the formula, it can be seen that the total error SS tot is caused for two reasons: the first one is the size of the output sequence length n and the second one is the random error caused by the degree of model merit.Obviously, the smaller the random error is, the larger the R 2 is, the closer it is to 1. R 2 is often used for performance testing of regression models.In general, the model can be considered as a good fit when R 2 > 0.8 .In this paper, R 2 is used as the evaluation algorithm of the model.

Dataset construction
In this paper, we selected 38 TikTok creators with recent and stable output, whose videos are all related to China and the content types are Chinese food or Chinese teaching.According to the short video communication key objective indicator system, the data to be crawled are like rate, comment rate, download rate, share rate, play growth rate, play growth acceleration and play volume.Starting from the release of the video, we crawled its likes, comments, downloads, shares and plays every 1 h.A total of 435,962 data and 2150 videos were crawled, with a size of about 80 M. The experimental data were collected in the time range of 2022.05.01-2022.8.17, and the specific data descriptions are shown in Table 2.  Experiment A takes all the Key and objective indicators except the outcome-playback as the input sequence and playback as the output sequence label.Label is the blue line drawn in the Fig. 6.The prediction result of Bi-LSTM is the red line drawn in the Fig. 6.The better the fit, the blue line and the red line are about close to a line; in the Fig. 7, the horizontal axis x is the true value, the vertical axis y is the predicted value, the orange dot indicates the value of y in the case that the true value is x, and the blue line indicates the predicted result of fitting to a straight line.The closer the blue line is to the diagonal y = x, the better the experimental result is.The right graph is more intuitive than the left graph.The coefficient of determination of the experimental results is R 2 = 0.38.
Experiment B fuses the short video communication network based on the key objective metric system with Bi-LSTM.The principle is the same as the communication model that combines the communication network based on the key objective indicator system with the Attention-based Bi-LSTM neural network model.All 11 Key and objective indicators are divided into four groups: the first group is static indicators with content type, production mode, presentation language, and originality; the second group is the number of likes, comments, shares, and downloads; the third group is the growth rate of play volume, the acceleration of play volume growth; the fourth group is play volume.The indicators in the first, second and third groups are input to the input layer of the model independently.The first and third sets of indicators are directly connected to the decision level through residual connections, and meet with the transformation results of the first three sets of indicators in the decision layer to jointly generate outputs.The structure is shown in Fig. 8.

Results
In order to be able to show the experimental results clearly, the real value of short video plays is presented separately as shown in Fig. 19.
We can summarize the following growth characteristics of play volume based on the growth curve of play volume: The play volume curve is monotonically increasing; The growth of play volume is non-linear; The growth of play volume is mainly concentrated in the period when the video is first released, and then the growth gradually slows down and gradually stabilizes.
The comparison of the experimental results of different methods is shown in Figs.20 and 21.
The comparison table of the results of R 2 is shown in Table3: (i) For the short video communication model with Bi-LSTM as the base framework, the performance is improved by about 113% by introducing KOIS-PN, a short video communication network based on the key objective metric system.The reason for the performance improvement analyzed in this paper is that KOIS-PN solves the applicability problem when the conventional LSTM model applied directly to short videos.For the conventional Bi-LSTM model, all features are equally entered into the model at once, regardless of the attributes of the features.For example, static indicator data as a kind of information that     does not change with time, it is obviously not reasonable enough to input it directly into the Bi-LSTM model that deals with time series and the experimental results also prove this idea that R 2 is only 0.38.The introduction of KOIS-PN is equivalent to treating the information of different attributes independently.
One is to prevent the single Bi-LSTM from losing non-time information, where the static indicator can be interpreted as a large bias; the other is to connect the play growth rate and the play growth acceleration, which are intuitively important metrics for decision making, so that the information can be delivered more smoothly.Overall, the introduction of KOIS-PN is actually for matching the model structure with the data characteristics.

Conclusion
Firstly, this paper constructs the international communication short video key objective indicator system, introduces the definition, selection and design logic of Key and objective indicators, and the idea about constructing the corresponding indicator system, and then constructs the international communication short video key objective indicator system and related data set from this.Secondly, this paper constructs the international short video communication network model.The construction of the international short video communication network is  The KOISPN-ATBiLSTM model proposed in this paper can not only realize the broadcast prediction of short videos, but also describe the mapping relationship from the Key and objective indicators of short videos' communication to the international communication effect.If we can specifically study the relationship between some of these indicators and the results, the model can be used to provide creation suggestions for the short videos with different types and different needs, so that it can be applied to the field of short video communication research.

Figure 1 .
Figure 1.Key and objective indicators system of international communication short video.

Figure 2 .
Figure 2. Short video communication network based on the key and objective indicators system of short video communication.

Figure 3 .
Figure 3. Structural framework of AT-BiLSTM neural network model.
14:14532 | https://doi.org/10.1038/s41598-024-65098-xwww.nature.com/scientificreports/Experimental analysis This section introduces and analyzes model performance, providing the strengths, limitations, and potential areas for improvement of the KOISPN-ATBiLSTM model.The experiments of Bi-LSTM-based short video broadcast prediction are divided into two groups.Group A is based on a single Bi-LSTM; Group B is based on the Bi-LSTM model and incorporates the Key and Objective Indicator System Based Communication Network (KOIS-PN) constructed in this paper.

Figure 7 .
Figure 7.The coefficient of determination of the experimental results of Bi-LSTM.

Figure 10 .
Figure 10.The coefficient of determination of the experimental results of KOISPN-BiLSTM.
, and the experimental results are shown in Fig. 18, R 2 = 0.84 .As seen in the above figure, the AT-BiLSTM model KOISPN-AT-BiLSTM, which incorporates the short video communication network based on the key objective indicator system, significantly outperforms the effect of the single AT-BiLSTM model.

Figure 12 .
Figure 12.The coefficient of determination of the experimental results of Attention.

Figure 14 .
Figure 14.The coefficient of determination of the experimental results of KOISPN-attention.

Figure 16 .
Figure 16.The coefficient of determination of the experimental results of AT-BiLSTM.
(ii) For the Attention-based neural network short-video communication model, after introducing KOIS-PN, a short-video communication network based on the key objective indicator system, the model performance is comparable to that of the unintroduced model.The results of this experiment show that a single Attention model does not exactly match the problem studied in this paper so this study will improve and optimize the model on the basis of Bi-LSTM.(iii) The performance of the short video communication model with AT-BiLSTM as the base framework is improved by about 91% after introducing KOIS-PN, a short video communication network based on the key objective indicator system.The analysis is similar to that in (i).(iv) The KOISPN-AT-BiLSTM model performs better than the KOISPN-BiLSTM model.The introduction of Attention makes the performance of LSTM-based models improve by about 3.7%, we analyze that this is because the Attention module can notice the information that cannot be learned by BiLSTM as well as the models incorporating KOISPN.For example, in Chapter II we discussed that there is a large amount of human intervention in the recommendation mechanism of short videos.If the video playback growth is suspended due to the entry audit, the conventional model may assume that the future playback of the video tends to be flat.This is actually the long-range forgetting problem of LSTM, and the introduction of Attention can alleviate this problem.

Figure 19 .
Figure 19.True value of short video plays.

Figure 18 .
Figure 18.The coefficient of determination of the experimental results of KOISPN-AT-BiLSTM.

Figure 20 .
Figure 20.Comparison of the experimental results of six groups of models (I).

Figure 21 .
Figure 21.Comparison of the experimental results of six groups of models (II).

Table 1 .
Key objective indicators of short video dissemination.